Active graph based semi-supervised learning using image matching: Application to handwritten digit recognition

Author

  • Hubert Cecotti
Abstract

With large amounts of documents and multimedia content to be classified, creating new databases of labeled examples is an expensive task. Efficient supervised classifiers often require large training databases that are not always immediately available. Active learning approaches address this issue by querying an expert to label selected instances. In this paper, we present a novel active learning strategy for the classification of handwritten digits. The proposed method is based on a k-nearest neighbor graph obtained with an image deformation model, which takes into account local deformations. During the active learning procedure, the user is first asked to label the vertices with the highest number of neighbors. Thus, the expert labels the examples that are most likely to propagate their labels to a high number of close neighbors. Then, a label propagation function is applied to automatically label the remaining examples. The procedure is repeated until all the images are labeled. We evaluate the performance of the method on four databases corresponding to different scripts (Latin, Bangla, Devnagari, and Oriya). We show that labeling only 332 images of the MNIST training database is sufficient to obtain an accuracy of 98.54% on this same database (60000 images). The robustness of the method is highlighted by the performance of handwritten digit recognition in different scripts. © 2016 Elsevier B.V. All rights reserved.
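
The loop described in the abstract (query the best-connected vertices, propagate, repeat) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: plain Euclidean distance stands in for the image deformation model, "number of neighbors" is taken to mean the number of still-unlabeled neighbors, propagation is a one-hop majority vote, and the names `knn_graph`, `active_label`, and `oracle` are hypothetical.

```python
import numpy as np

def knn_graph(X, k):
    """Indices of the k nearest neighbors of each row of X.

    Assumption: Euclidean distance replaces the paper's image deformation model.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def active_label(X, oracle, k=3, queries_per_round=1):
    """Label every row of X; oracle(i) plays the expert and returns a class id >= 0.

    Returns (labels, number_of_oracle_queries).
    """
    n = len(X)
    nbrs = knn_graph(X, k)
    # Undirected adjacency: i and j are linked if either lists the other.
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = True
    adj |= adj.T

    labels = np.full(n, -1)  # -1 marks "not yet labeled"
    n_queries = 0
    while (labels == -1).any():
        unlabeled = labels == -1
        # Score each unlabeled vertex by its count of unlabeled neighbors.
        score = (adj & unlabeled[None, :]).sum(axis=1)
        score = np.where(unlabeled, score, -1)
        for i in np.argsort(-score, kind="stable")[:queries_per_round]:
            if labels[i] == -1:
                labels[i] = oracle(i)
                n_queries += 1
        # One-hop propagation: majority vote over already-labeled neighbors.
        known = labels != -1
        for i in np.flatnonzero(~known):
            votes = labels[adj[i] & known]
            if votes.size:
                labels[i] = np.bincount(votes).argmax()
    return labels, n_queries
```

On well-separated clusters, a loop of this kind needs far fewer expert queries than there are examples, because each queried vertex hands its label to its graph neighbors.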

Similar resources

Combining Committee-Based Semi-supervised and Active Learning and Its Application to Handwritten Digits Recognition

Semi-supervised learning reduces the cost of labeling the training data of a supervised learning algorithm by using unlabeled data together with labeled data to improve performance. Co-Training is a popular semi-supervised learning algorithm that requires multiple redundant and independent sets of features (views). In many real-world application domains, this requirement cannot be sa...

Graph Based Semi-supervised Learning in Computer Vision

Dissertation by Ning Huang; Dissertation Director: Joseph Wilder. Machine learning from previous examples or knowledge is a key element in many image processing and pattern recognition tasks, e.g. clustering, segmentation, stereo matching, optical flow, tracking and object recognition. Acquiring that knowledge frequently requires huma...

Sparse Modeling of High-Dimensional Data for Learning and Vision

Sparse representations account for most or all of the information of a signal by a linear combination of a few elementary signals called atoms, and have increasingly become recognized as providing high performance for applications as diverse as noise reduction, compression, inpainting, compressive sensing, pattern classification, and blind source separation. In this dissertation, we learn the s...

Neural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten

Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications, including reading aids for the blind, bank cheques, and conversion of handwritten documents into structured text. Neural Network (NN), with its inherent learning ability, offers promising solutions for handwritten characte...

Persian Handwritten Digit Recognition Using Particle Swarm Probabilistic Neural Network

Handwritten digit recognition can be categorized as a classification problem. Probabilistic Neural Network (PNN) is one of the most effective and useful classifiers, and is based on the Bayesian rule. In this paper, in order to recognize Persian (Farsi) handwritten digits, a combination of an intelligent clustering method and PNN has been utilized. Hoda database, which includes 80000 P...

Journal title:
  • Pattern Recognition Letters

Volume 73

Publication date 2016